arXiv:1504.00522vl [cs.RO] 2 Apr 2015 


Monte Carlo Localization in Hand-Drawn Maps 


Bahram Behzadian Pratik Agarwal 


Abstract —Robot localization is a one of the most important 
problems in robotics. Most of the existing approaches assume 
that the map of the environment is available beforehand and 
focus on accurate metrical localization. In this paper, we address 
the localization problem when the map of the environment 
is not present beforehand, and the robot relies on a hand- 
drawn map from a non-expert user. We addressed this problem 
by expressing the robot pose in the pixel coordinate and 
simultaneously estimate a local deformation of the hand-drawn 
map. Experiments show that we are able to localize the robot 
in the correct room with a robustness up to 80%. 

1. Introduction 

Localization, the problem of estimating a robot pose in 
the environment, is probably one of the most studied and 
understood problems in mobile robotics. Several solutions 
have been presented in the last decades, most of them based 
on probabilistic inference over the space of possible robot 
configurations (221 . Although the existing approaches have 
been demonstrated to be robust, efficient and very accu¬ 
rate El 13 da, they mostly rely on one major assumption: 
The existence of an already available map of the environ¬ 
ment, built beforehand with the same sensing modality of 
the robot. 

In some circumstances, however, building such a map 
could be a nuisance for the user. This is the case, for instance, 
of vacuum cleaners, lawn mowers, pool cleaners and many 
other service robots. Often, when the user buys such a 
robot, he or she wants to use it immediately without waiting 
for an expensive and time-consuming mapping routine to 
complete. In some other cases, building an a-priori map is 
not even possible, for instance in environments that may 
harm humans, e.g., a mine field or a toxic factory. Moreover, 
mapping algorithms may result in local minima and the 
resulting maps might be unusable for navigation. Although 
automatic tools to detect such inconsistencies exist C3, they 
require an expert user to analyze the data to correct the map. 

In this paper, we address the localization problem when no 
map of the environment is available beforehand. We propose 
an algorithm that solely requires a hand-drawn map of the 
environment, sketched by the user. We believe that drawing 
such a map puts no burden on the user and is an intuitive task. 
Furthermore, we do not assume that the map is metrically 
accurate nor proportional up to a single scale. Objects might 
be missing, and the deformation of the map might change 
at different locations. This reduces the ability to accurately 
localize the robot in a metric sense, since the map is bended 
in different ways and distances are not respected. To address 
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Fig. 1. Occupancy grid of the dataset FR079 (top), built using a SLAM 
algorithm, an hand-drawn map used to localize the robot (middle) and their 
overlay (bottom). 

these problems, we extend the Monte Carlo localization 
algorithm in two ways. First, we express the robot pose in the 
pixel coordinate of the drawing. This resolves the metrical 
issues and provides a sort of normalization with respect to the 
deformation. Second, we augment the state space of the robot 
with a deformation parameter and track the local deformation 
of the drawing over time. 

II. Related work 

Robot localization is a widely studied problem ^T2\ 
and several approaches have been proposed in the past, 
such as Markov localization 0, Monte Carlo localization 
(MCL) O, and multiple hypothesis tracking (MHT) 
Those approaches rely on the existence of a prior map 
and a static environment. Some researchers extended those 
approaches to be able to handle changes in the environment 
over time (23l [I2l0. However, they still rely on metrical 
maps and some prior information of the the environment to 
































be built beforehand. 

Few works have been done with respect to localization 
without prior maps. Koenig and Simmons lEJ propose 
a localization approach, where the user provides only a 
topological sketch of the environment. Some authors used 
floor plan maps available from construction blueprints. Ito 
et al. HI propose a localization algorithm that relies on 
blueprints of the environment. They employ an RGB-D 
sensor to estimate the walls of the environment and match 
them to the floor plan. To improve the initialization, they 
rely on a map of WiFi signal strengths. Setalaphruk et al. 
Cll employ a multi-hypothesis Alter on blueprints. They 
first compute a Voronoi diagram of the floor plan and store 
its intersections as landmarks. Second, they use the Voronoi 
intersection from the current readings and match them to the 
one computed from the map. Contrary to our approach, they 
assume that the blueprint of the environment is metrically 
correct, and its scale is known. 

Most of the works using hand-drawn maps exploit them 
as a mean of communication between the robot and a human 
operator. Kawamura et al. cni present an approach where 
an operator first sketches a rough map of the environment 
with a set of way points to follow. They use a library of 
known objects to associate perceptions to the sketch map 
and triangulate the robot position using angle measurements. 
Skubic et al. ED Propose a sketch interface for guiding a 
robot. The user draws a sketch of both the environment and 
instructs the robot to go to certain locations. The system 
computes a path in the hand-drawn map, and the robot 
execute it in an open-loop controller, without localizing the 
robot. Shah et al. propose a similar approach. The 
focus of the work, however, is on how to extract qualitative 
instructions for the robot. They translate the instructions to 
robot commands and localize the robot using an overhead 
camera system. Yun and Miura ll24ll proposed and evaluated 
a quantitative measure of navigability on hand-drawn maps. 
The work only focuses on qualitative aspects of naviga¬ 
bility but does not address the ability to localize in them. 
Moreover, the sketch maps that they considered are made 
of line segments and are automatically generated. Skubic 
et al. 1^ propose an approach to qualitative reason about 
sketch maps. The authors are able to extract information 
such as an obstacle is on the right of the robot and give 
qualitative commands as turn left or go straight. The work 
has been extended by Chronis and Skubic 121 with the 
use of reactive controllers to guide the robot. Forbus et al. 
Q developed nuSketch, a qualitative reasoning framework 
exploiting topological properties and Voronoi diagrams. They 
provide answers to query like finding places with certain 
properties or properties of paths. Our work is orthogonal 
to them, and the proposed localization algorithm can be 
integrated with any of those control interfaces. 

Localization using hand-drawn maps has received very 
little attention from the robotics community. Parekh et al. 
ca presented a technique to perform scene matching be¬ 
tween a map of the environment and a sketch using spatial 
relations. They assume a set of objects is present in both 


the sketch and the map with known segmentation. They 
then use particle swarm optimization (PSO) techniques to 
compute the alignment and the sketch deformation. Matsuo 
and Miura ca extended the previous work in a simultaneous 
localization and mapping (SLAM) framework. They assume, 
however that the sketch map is composed of rectangles 
corresponding to building in the scene. Shah and Campbell 
03 present an algorithm for controlling a mobile robot 
using a qualitative map consisting of landmarks and path 
waypoints. They assume that the both the sketch map and 
the real environment is made of point landmarks and also 
assume known data associations between them. In contrast 
to them, our approach does not make any assumption on the 
format of the sketch map and treats it as a raster image. 
Moreover, we do not attempt to transform the hand-drawn 
map to reflect the real world but we directly localize the 
robot in the hand-drawn map. To the best of our knowledge, 
the proposed approach is the first attempt to localize a robot 
from a generic hand-drawn map with no further assumptions. 


III. Localization in Hand-Drawn Maps 

In this section, we describe the extension we made in the 
original Monte Carlo localization algorithm 0 for localizing 
in hand-drawn maps. We propose two main extensions. First, 
we augment the state space of the robot with an additional 
variable that represents the local deformation of the map. 
Second, we localize the robot in the pixel coordinate frame 
of the map, instead of the world coordinate frame. In order 
to do so, we extended both the motion and the observation 
model to be locally projected onto the hand-drawn map. 


A. Monte-Carlo Localization in Pixel Coordinates 

The goal of our work is to estimate the pose of the robot 
xt G SE{2) and a local scale s G M at time t, given 
the history of odometry measurements ui:t and observations 
zi-t. Formally, this is equivalent to recursively estimate the 
following posterior: 


p{xt,St I Zi:t, Ui,t,m) = r)p{zt \ Xt,m) 

/ P{st I S(_l)p(xt I X(_i, S(_i, U() ■ 
p(X(_i,S(_i I dX(_idSt_i, (1) 


where is a normalization factor and m is the hand-drawn 
map. The motion model p(xt | x^-i, 5t-i, denotes the 
probability that the robot ends up in state x^ given it executes 
the odometry readings Ut in state Xt_i. The distribution 
p{st I St-i) represents the Brownian motion of the scale 
parameter and the observation model p{zt \ yit^St^m) 
denotes the likelihood of making the observation Zt given 
the robot’s pose x^, the local scale St, and the map m. 

Following the MCL approach, we approximate the distri¬ 
bution as a weighted sum of Dirac delta functions. The recur¬ 
sive estimation is performed using the sequential importance 
resampling algorithm (41. For the proposal, we sample the 
pose and the scale from the motion model and the Brownian 
process, respectively. Under the chosen proposal distribution, 
we compute the weight of the particle according to the 


observation model and the recursive term. The particle set is 
then resampled, at each iteration, according to the weight of 
the particles. 

To compute the number of particles needed, one can 
use the KLD sampling approach of Fox O. The algorithm 
estimates a discrete approximation of the target distribution 
using the weighted set of particles. During resampling, 
it computes the Kullback-Leibler divergence each time a 
new particle is resampled and stops the process when the 
divergence is below a confidence level. 

B. Proposal Distribution 

The purpose of the proposal distribution is to provide a 
mean for sampling a new set of particles given the current 
one. In the original MCL algorithm, the proposal distribution 
is the robot motion model. In our work, we need to provide 
a proposal distribution for both the robot position x and 
the local scale s. We modified the original motion model 
describe in the MCL algorithm to project the motion of 
the robot in the image coordinates. Let xj and s\ the pose 
and scale associated with the i-particle at time t and Uf the 
incremental odometry measurement. The new pose of the 
particle is computed as follow 


Xt+i = 


S (u( © e) 


C. Observation Model 

After we sample the particles according to the proposal 
distribution, we follow the importance sampling principle 
and compute the weights of the particle set. With our choice 
of proposal, the weight of each particle must be proportional 
to the observation model p(z | x, s,m), where we omitted 
the time index for notational simplicity. Intuitively, this 
model describes the likelihood of the measurement z, given 
the hand-drawn map m, the local scale s and the pose of the 
robot in the image coordinates x. In this work, we consider 
2D laser range finders as sensors. The measurements z = 
[zi, • • • ,zk]^ consist of a set of range values along a set of 
corresponding directions a = We employ 

the beam based model ea and modify it to project the 
observations in the hand-drawn map. 

Formally, let Zi be the i-th range measurement along the 
angle ai . Let us trace a ray on the map m from the robot pose 
X to the closest obstacle in the map and be z the measured 
distance. The original formulation of the beam based model 
considers z as being expressed in world coordinates and 
describes the measurement distribution as a mixture of four 
components 
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where Ut represents the odometry reading and e is a sample 
from the noise term. We sample e = [ qj $1 ] from 
the normal distribution and a wrapped normal distribution, 
respectively for the translation and the rotational 01 part. 

q - V(0,I]q) (3) 

Ot - >VV(0,<T^), (4) 

where Sq and ao are the covariance matrix for the translation 
and the standard deviation for the rotation. The scale follows 
a Brownian motion and is sampled according to following 
process 

sj+i = si+ e^ e -- Af{0, a^), (5) 

where e* is a sample from a normal distribution and as is 
the standard deviation of the scale increment. Note that we 
include a standard deviation term in the Wiener process. 
This is to account for smaller variations than its original 
formulation. One can formulate Eq. using the standard 
formulation of the Wiener process by including an additional 
scaling term to the e\ 

Intuitively, given the pose of a particle and its estimated 
scale, we first sample a zero mean, SE{2) transformation 
from the odometry noise and perturb the odometry measure¬ 
ment accordingly. We then project the perturbed odometry on 
the hand-drawn map and apply the projected transformation 
to the robot pose. The scale sampling has been chosen to 
be locally close to the scale of the previous step, but still 
able to explore the whole space. The Brownian motion was 
a natural choice for that, given its statistical properties. 


where /hit models the measurement noise, /dyn models the 
unexpected obstacles not present in the map, /max models 
the sensor maximum range, and /md models a uniformly 
distributed random measurement. The functions are defined 
as following 
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Here, TEXP(x;A,a) denotes a truncated exponential dis¬ 
tribution with parameter A and support between 0 and a. 
U{x;h^ c) denotes a uniform distribution between b and c and 
6 is Si window parameter. To account for the deformations, 
we need to project the real measurements coming from 
the sensor to the image coordinate frame by applying the 
estimated transformation. In our case, this entails to scale 
all the ranges according to estimated scale s. The resulting 
observation model is 

p{Zi \ s,Zi) =p{^ \ Zi) (11) 

All the parameters of the model have been learned from 
real data. To collect the data for the learning phase, we 
positioned the robot at fixed locations in the environment 
and draw few different sketches for each location. Then, for 
each sketch and location, we performed grid search to find 
the best scale. Given the scale, we computed the maximum- 
likelihood estimate of the parameters, following the approach 
described in (221. 









Fig. 2. Simulated Apartment (left) and Room (right) environment in 
Gazebo. The area covered by the simulated laser scanner is shown in blue. 

IV. Experimental Results 

We evaluated our approach in both simulated and real 
environments. For the simulation, we used the Gazebo sim¬ 
ulator available in ROS. We created two virtual environ¬ 
ments, resembling a simulated room (Room) and a simulated 
apartment (Apartment). Figure depicts the two simulated 
environments together with a simulated robot. In the real 
world experiment, we tested our algorithm in our building 
(FR079), whose map and a full dataset is publicly available. 
We chose this dataset for two reasons. First it contains 
very challenging aspects of localization algorithms, given the 
presence of many, similarly looking rooms. Second, similar 
data is publicly available online, and other researchers can 
use that to replicate our results. Since not everyone has the 
same artistic capabilities, we asked nine people to walk in 
the environment and draw a sketch. Figure shows all the 
nine sketches together. 

We use the same parameters for all our experiments 
and the simulated robot. For the proposal distribution, we 
have Sq = 0.1/ as covariance matrix for the translational 
component, (Tq = 0.05 for the rotational noise, and (Tg =0.1 
for the scale noise. With respect to the observation model, we 
set (7;^ = 0.1 according to our sensor specification, and we 
estimated the rest of the parameters from data. The resulting 
estimated values are A = 0.1 for the exponential distribution, 
d = 0.01 for the max range, and rchit = 0.005, w^yn = 0.5, 
f^max = 0.3, fUrnd = 0.4 for the mixture weights. We also 
subsampled the range measurements to only 10 beams per 
scan, to compensate for the independence assumption of the 
laser beams. We intialized the filter by drawing the particles 
uniformly over a region of the map drawn by the user, for the 
position, uniformly between — tt and tt for the orientation, 
and uniformly between 0.01 and 1 for the scale. The square 
region was about the size of a room, simulating a plausible 
initial guess from the user. 

A. Simulated Experiments 

For the simulation, we used the Room environment as a 
proof of concept scenario. We let the robot start in the lower 
right corner of the room and performed 4 different paths. 
We simulated a weak prior on the initial robot position by 
sampling the particles uniformly in a square of 150 x 150 
centered at the robot position. We obtained a success rate 
of 100% in localizing the robot. Some videos of the whole 
experiment are available on Youtub^ 

^https://www.youtube.com/playlist?list=PL2DAq2wc_ 
IgJnSTYusQjeck-gc3fc2 jtP 



Fig. 3. Hand-drawn map of the Apartment (left) and the Room (right) 
environment. The dashed squares represents the rooms we used in our 
experiment. 



Fig. 4. Overlay of the hand-drawn map over the Apartment environment. 
The hand-drawn map has been manually aligned. Note the non uniform 
scaling and the distortions present. 

For the second experiment in the Apartment environment, 
we simulated a series of navigation tasks, where the robot 
starts in a certain location and is instructed to reach a 
specific room. We believe this is the natural application of 
our approach, where the user sketches the map, a rough 
location of the robot and ask the robot to reach a particular 
showing it on the map. We set up our experiments in the 
following way. After we draw a sketch of the environment, 
we subdivided it into small square, each representing a room 
to be reached. We then randomly generated 10 different 
sequences navigation tasks, in the form of go from room 
A to room B. For each sequence, we performed 10 runs of 
our localization algorithm with different random seeds. We 
considered a sequence as a success if the robot, at the end 
of the trajectory, is localized in the correct room. 

Figure shows the hand-drawn map used in this exper¬ 
iments, together with our subdivision. To understand the 
differences between the real map and the hand-drawn one. 
Figure shows an overlay of the two, where we manually 
rotate, scaled and aligned the two maps. Even under manual 
alignment, one can see that the scaling of the sketch is not 
uniform with respect to the real map and that many distor¬ 
tions are present. Table shows the results of the experiment, 
together with the sequences we used. Our approach has an 
overall success rate of 93%. We only had a few failures in 
the paths from 10 to 6 and 13 to 6. Note that this are the 
most challenging paths, since are the longest and traverse 
the whole map. We also had some problems from 13 to 10. 






































room a —)► b I Chance of Success 


1 ^ 6 

100% 

1 ^ 10 

100% 

6 ^ 1 

100% 

6 ^ 10 

100% 

8 ^ 1 

100% 

8^6 

100% 

10 ^ 1 

100% 

10 ^ 6 

70% 

13 ^ 6 

80% 

13 ^ 10 

80% 


Total I 93% 

TABLE I 

Success rate eor the Apartment environment 

This was due to the ambiguity in the two corners, where the 
robot was mistakenly localized in the wrong one. 

All the trajectories for this experiment are publicly avail¬ 
able on youtub^ 

B. Real World Experiments 

Figure shows the hand-drawn maps of Building 079 used 
for this experiment. The numbers in the figure represent the 
different rooms we identified in the environment. In a way 
similar to the simulated experiments, we randomly generate 7 
navigation sequences and, for each sequence, we performed 
10 runs of our localization algorithm with different random 
seeds. Table [II| shows the localization results with the real 
data for each run. The ratio difference denotes the absolute 
difference between the ratio (length/width) of the original 
occupancy grid map and each hand-drawn map. Figure 
illustrates the success rate as a function of the difference in 
ratios. We see that localization has a high success rate, almost 
80%, when the difference in the ratio of the hand-drawn map 
is relatively low. The success rate of localization decline, 
when this difference increases. The table shows the highest 
failure in the test run from room 9 to 12. Room number 
9 is fully occupied with furniture that heavily distorted the 
image of the walls in the laser scan, therefore, the robot 
was not able to localize properly in the beginning. The robot 
randomly localized itself in any of the other rooms looking 
alike. The lowest successful rate was obtained when using 
the map No. 2. The user has drawn the doors in an unusual 
way that the robot can not recognize the entrance and exit 
properly. The localization results for AIS map-Cj^and map-^ 
and is publicly available on youtube. 

In addition to FR079, we also tested our method on the 
Intel dataset. The videos from the Intel dataset are publicly 
available on youtub^ 

^ https://www.youtube.com/playlist ?list=PL2DAq2wc_ 
lgI4x4TKlPQ0sovRKk-990wS 

■^https : //www. youtube . com/playlist ?list=PL2DAq2wc_ 
lgJu9ftPMaozs0f19kwrkXCZ 

^ https://www.youtube.com/playlist ?list=PL2DAq2wc_ 
IgJdz fJJDNUNvomOUWCUSMz 4 

'https://www.youtube.com/watch?v= 
uQhKl9 jpa2I&index=2&list=PL2DAq2wc_ 

Igl9JDwGfDLvZVds_vwpk30S 



Fig. 6. Percentage of FR079 runs successfully localized. 

V. Discussion 

We believe our approach will also work with blueprints of 
the environment, given their metrical accuracy. 

VI. Conclusions 

In this paper, we addressed the problem of robot localiza¬ 
tion when no accurate map of the environment is available 
and the robot has to solely rely on a hand-drawn map 
sketched by the user. To do so, we extended the classical 
Monte Carlo localization algorithm in two ways. First, we 
propose to localize with respect to the image coordinate 
frame. Second, we track, together with the pose of the robot, 
a local deformation of the hand-drawn map. Since no metric 
information is available on the hand-drawn map, we propose 
to evaluate the localization in terms of coarse localization 
at the level of rooms of the environment. We evaluated 
our approach in both simulated and real environments and 
achieved a correct localization, up to the room level, of about 
80% of the cases when the ratio of the sketch map resembles 
the real environment. We believe this is a starting point 
that addresses a very challenging problem with potential 
applications. In future, we plan to extend our approach to 
incorporate more sophisticated distortion models and employ 
it for navigation purposes. 
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